Details of the Adjusted Rand index and Clustering algorithms

Authors

  • Ka Yee Yeung
  • Walter L. Ruzzo
Abstract

Given two partitions of the same set of objects, one into classes and one into clusters, let D be the number of pairs of objects that are placed in the same class and in the same cluster, E be the number of pairs of objects in the same class but not in the same cluster, F be the number of pairs of objects in the same cluster but not in the same class, and G be the number of pairs of objects in different classes and different clusters. The quantities D and G can be interpreted as agreements, and E and F as disagreements. The Rand index [Rand, 1971] is simply $\frac{D + G}{D + E + F + G}$, the fraction of object pairs on which the two partitions agree.
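The pair counts and the Rand index can be illustrated with a minimal Python sketch. It assumes each partition is given as a list of labels, one per object, in the same object order; the function names `pair_counts` and `rand_index` are hypothetical and do not come from the paper. The sketch enumerates every pair directly, so it is meant for small examples rather than large data sets.

```python
from itertools import combinations


def pair_counts(classes, clusters):
    """Return (D, E, F, G) for two labelings of the same objects.

    D: same class and same cluster      (agreement)
    E: same class, different clusters   (disagreement)
    F: same cluster, different classes  (disagreement)
    G: different class and cluster      (agreement)
    """
    if len(classes) != len(clusters):
        raise ValueError("both labelings must cover the same objects")
    d = e = f = g = 0
    # Visit every unordered pair of objects exactly once.
    for i, j in combinations(range(len(classes)), 2):
        same_class = classes[i] == classes[j]
        same_cluster = clusters[i] == clusters[j]
        if same_class and same_cluster:
            d += 1
        elif same_class:
            e += 1
        elif same_cluster:
            f += 1
        else:
            g += 1
    return d, e, f, g


def rand_index(classes, clusters):
    """Rand index: (D + G) / (D + E + F + G)."""
    d, e, f, g = pair_counts(classes, clusters)
    total = d + e + f + g
    if total == 0:
        raise ValueError("need at least two objects")
    return (d + g) / total


# Illustrative labels only (not data from the paper).
classes = [0, 0, 0, 1, 1, 1]
clusters = [0, 0, 1, 1, 2, 2]
print(pair_counts(classes, clusters))  # (2, 4, 1, 8)
print(rand_index(classes, clusters))   # 10 of 15 pairs agree
```

In this toy example, 2 + 8 = 10 of the 15 pairs are agreements, so the Rand index is 10/15.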


Similar resources

Clustering Gene Expression Data Using Random Forest Dissimilarity

Background: The clustering of gene expression data plays an important role in the diagnosis and treatment of cancer. These kinds of data typically involve a large number of variables (genes) in comparison with the number of samples (patients). Many clustering methods have been built based on the dissimilarity among observations, which is calculated by a distance function. As increa...


Weighted Ensemble Clustering for Increasing the Accuracy of the Final Clustering

Clustering algorithms are highly dependent on different factors such as the number of clusters, the specific clustering algorithm, and the distance measure used. Inspired by ensemble classification, one approach to reduce the effect of these factors on the final clustering is ensemble clustering. Since weighting the base classifiers has been a successful idea in ensemble classification, in th...


Supplement to “Clustering Gene Expression Data with Repeated Measurements”

Cluster accuracy: agreement with the functional categories. Each entry shows the adjusted Rand index of the corresponding algorithm with the functional categories. The maximum adjusted Rand index of each row is shown in bold. The algorithms (rows) are sorted in descending order of the maximum adjusted Rand index in each row. DIANA and single-link produce the least accurate clusters. *CAST did not conv...


Performance of an Ensemble Clustering Algorithm on Biological Data Sets

Ensemble clustering is a promising approach that combines the results of multiple clustering algorithms to obtain a consensus partition by merging different partitions based upon well-defined rules. In this study, we use an ensemble clustering approach for merging the results of five different clustering algorithms that are sometimes used in bioinformatics applications. The ensemble clustering ...


Selecting Ensemble Members in Ensemble Clustering Using Voting

Clustering is the process of dividing a dataset into subsets called clusters, so that objects within a cluster are similar to each other and different from objects in other clusters. So far, many algorithms based on different approaches have been created for clustering. An effective choice is to combine two or more of these algorithms to solve the clustering problem. Ensemb...



Publication date: 2001